fix: precompute project group merge suggestion counts (CM-1230)#4250
fix: precompute project group merge suggestion counts (CM-1230)#4250skwowet wants to merge 13 commits into
Conversation
Signed-off-by: Yeganathan S <63534555+skwowet@users.noreply.github.com>
Signed-off-by: Yeganathan S <63534555+skwowet@users.noreply.github.com>
Signed-off-by: Yeganathan S <63534555+skwowet@users.noreply.github.com>
Signed-off-by: Yeganathan S <63534555+skwowet@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR introduces a precomputed, per–project-group cache of merge-suggestion counts to avoid expensive COUNT(*) queries timing out on large project groups, while keeping filtered counts and list pagination as live queries. It adds a cron refresher plus immediate count adjustments when users merge or dismiss suggestions.
Changes:
- Adds
segmentMergeSuggestionCountstable + DAL helpers to compute, upsert, fetch, and decrement counts per project-group segment. - Adds a cron job to refresh cached counts every 4 hours (only writing when counts change).
- Updates member/org merge-suggestion repositories and merge/no-merge flows to use cached totals for unfiltered single-project-group counts and decrement counts on user actions.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
| services/libs/data-access-layer/src/segments/index.ts | Adds DAL functions for project-group segment enumeration, count calculation, cached count upsert/get, common-segment lookup, and decrement helpers. |
| services/libs/common_services/src/services/common.member.service.ts | Decrements cached member merge-suggestion counts after a successful member merge. |
| services/apps/cron_service/src/jobs/refreshSegmentMergeSuggestionCounts.job.ts | New cron job that refreshes cached counts on a 4h schedule. |
| backend/src/services/organizationService.ts | Decrements cached org merge-suggestion counts after org merge and org no-merge actions. |
| backend/src/services/memberService.ts | Decrements cached member merge-suggestion counts after member no-merge actions. |
| backend/src/database/repositories/organizationRepository.ts | Uses cached totals for unfiltered single-project-group counts; adjusts mergeActions filtering in list/count SQL. |
| backend/src/database/repositories/memberRepository.ts | Uses cached totals for unfiltered single-project-group counts; adjusts mergeActions filtering in list/count SQL. |
| backend/src/database/migrations/V1782128525__segment-merge-suggestion-counts.sql | Adds the segmentMergeSuggestionCounts table schema. |
Comments suppressed due to low confidence (1)
backend/src/database/repositories/organizationRepository.ts:820
- Using a mergeActions LEFT JOIN +
(ma.id IS NULL OR ma.state = ERROR)can multiply rows and can incorrectly include a suggestion if there are mergeActions recorded in both directions with different states (the ERROR row still passes). A NOT EXISTS predicate avoids join-multiplication and correctly excludes suggestions when any non-ERROR mergeAction exists.
LEFT JOIN "mergeActions" ma
ON ma.type = :mergeActionType
AND (
(ma."primaryId" = otm."organizationId" AND ma."secondaryId" = otm."toMergeId")
OR (ma."primaryId" = otm."toMergeId" AND ma."secondaryId" = otm."organizationId")
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Signed-off-by: Yeganathan S <63534555+skwowet@users.noreply.github.com>
Signed-off-by: Yeganathan S <63534555+skwowet@users.noreply.github.com>
Signed-off-by: Yeganathan S <63534555+skwowet@users.noreply.github.com>
…ounts table Signed-off-by: Yeganathan S <63534555+skwowet@users.noreply.github.com>
…ries to use single segment ID Signed-off-by: Yeganathan S <63534555+skwowet@users.noreply.github.com>
Signed-off-by: Yeganathan S <63534555+skwowet@users.noreply.github.com>
…y query Signed-off-by: Yeganathan S <63534555+skwowet@users.noreply.github.com>
|
@skwowet looks good to me, and this can seriously speed up things! ✅ |
|
@ulemons Good point. No, the job isn't running in prod yet. I think the safest rollout would be:
That should avoid any weird results during the initial backfill |
Summary
CM-1230 addressed
membersToMerge/organizationsToMergetimeouts by scoping merge suggestions against rollup rows inmemberSegmentsAgg/organizationSegmentsAgg— matching the selected project-group segment directly instead of expanding to leaf subprojects and hitting the agg tables with hugesegmentId IN (...)lists.That fixed the worst leaf-expansion cases, but unfiltered COUNT requests were still slow for large project groups: even with a single rollup segment ID, Postgres still scanned all merge rows and ran expensive EXISTS checks against very large agg buckets, which could exceed API timeouts.
This PR adds precomputed per–project-group merge suggestion counts (same idea as other agg-backed reads: expensive work offline, cheap read online). Unfiltered badge/list counts are served from
segmentMergeSuggestionCounts; filtered counts still use the live query — that is intentional and fine: the production incident was unfilteredcountOnlyon huge project groups (e.g. LF OSS Index), not filtered data-quality views. Filtered requests (similarity,displayName,memberId/organizationId) keep the existing live COUNT path; they were not timing out before this PR and do not need precomputation.Also restores project / subproject filtering on the merge suggestions list page (Projects dropdown). Segment scoping had regressed to project-group-only; list and live COUNT again use
filter.projectIds/filter.subprojectIdswhen set, with precomputed counts only for unfiltered project-group scope.Counts are refreshed on a schedule and adjusted immediately when users merge or dismiss suggestions.
How it works now
segmentMergeSuggestionCounts(skips write when values are unchanged).memberToMerge/organizationToMerge+ agg EXISTS. Same query shape as before this PR; acceptable at query time because filtered views are not the high-volume unfiltered badge path and were not the timeout source.Changes
segmentMergeSuggestionCountstable (segmentId, member/org counts,updatedAt).segments/index.ts): fetch project groups; calculate/upsert/get counts; common project-group segment lookup; decrement helpers.refresh-segment-merge-suggestion-countsjob (batched parallelism, fail-fast on segment errors).getTotalCount()— precomputed vs live; aligned member/org list + COUNT SQL (NOT EXISTSfor mergeActions, consistent EXISTS); project-group guard for cache reads; simplified org list query (removed CTE / hash dedup).projectIds/subprojectIds); precomputed table remains project-group only when no project filter and no count filters.